Relational Algebra for In-Database Process Mining
Abstract
The execution logs used for process mining in practice are often obtained by querying an operational database and storing the result in a flat file. Consequently, the data processing power of the database system can no longer be applied to this information, which constrains flexibility in the definition of mining patterns and limits execution performance when mining large logs. Enabling process mining directly on a database, instead of via intermediate storage in a flat file, therefore provides additional flexibility and efficiency. To help facilitate this ideal of in-database process mining, this paper formally defines a database operator that extracts the ‘directly follows’ relation from an operational database. This operator can be used both to do in-database process mining and to flexibly evaluate process-mining-related queries, such as: “which employee most frequently changes the ‘amount’ attribute of a case from one task to the next”. We define the operator using the well-known relational algebra that forms the formal underpinning of relational databases. We formally prove equivalence properties of the operator that are useful for query optimization and present time-complexity properties of the operator. In doing so, this paper formally defines the relational algebraic elements of a ‘directly follows’ operator that are required to implement such an operator in a DBMS.
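As an illustration of the ‘directly follows’ relation described above, the following is a minimal sketch that computes it inside a database using a self-join with an anti-join condition. It is not the paper's formal operator definition, which is not reproduced in this abstract. The table name `log`, the columns `case_id`, `activity` and `ts`, and the function name `directly_follows` are all assumptions made for the example, as is the use of SQLite and the assumption that timestamps are unique within a case.

```python
import sqlite3


def directly_follows(conn):
    """Return (case_id, a, b) rows where activity b directly follows a in a case.

    Assumes a hypothetical table log(case_id, activity, ts) in which
    timestamps are unique within each case.
    """
    query = """
        SELECT e1.case_id, e1.activity, e2.activity
        FROM log AS e1
        JOIN log AS e2
          ON e1.case_id = e2.case_id AND e1.ts < e2.ts
        WHERE NOT EXISTS (
            -- no third event of the same case lies strictly between e1 and e2
            SELECT 1 FROM log AS e3
            WHERE e3.case_id = e1.case_id
              AND e3.ts > e1.ts AND e3.ts < e2.ts
        )
        ORDER BY e1.case_id, e1.ts
    """
    return conn.execute(query).fetchall()


# Small illustrative log (made-up data): two cases with a few events each.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (case_id INTEGER, activity TEXT, ts INTEGER)")
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?)",
    [
        (1, "register", 1), (1, "check", 2), (1, "pay", 3),
        (2, "register", 1), (2, "pay", 2),
    ],
)
pairs = directly_follows(conn)
for row in pairs:
    print(row)
```

Because the result is an ordinary relation, it could in principle be joined, filtered or aggregated further in the same query, which is the kind of flexibility the abstract refers to.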
Journal: CoRR
Volume: abs/1706.08259
Publication date: 2017